Learning Loop Invariants for Program Verification
The problem of inferring loop invariants is undecidable, and even practical instances are challenging. Inspired by how human experts construct loop invariants, we propose Code2Inv, a reasoning framework that constructs the solution by multi-step decision making and by querying an external program-graph memory block. By training with reinforcement learning, Code2Inv captures rich program features and avoids the need for ground-truth solutions as supervision. Compared to previous learning tasks in domains with graph-structured data, it addresses unique challenges, such as a binary objective function and an extremely sparse reward that is given by an automated theorem prover only after the complete loop invariant is proposed. We evaluate Code2Inv on a suite of 133 benchmark problems and compare it to three state-of-the-art systems. It solves 106 problems, compared to 73 by a stochastic search-based system, 77 by a heuristic search-based system, and 100 by a decision tree learning-based system. Moreover, the learned strategy generalizes to new programs: compared to solving new instances from scratch, the pre-trained agent is more sample-efficient in finding solutions.
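To make the verification objective concrete, the sketch below checks a candidate loop invariant against the three standard conditions (initiation, consecution, safety) that a theorem prover would discharge. It is a minimal illustration, not Code2Inv itself: the program, the candidate, and the bounded exhaustive check standing in for an SMT solver are all assumptions for this example.

```python
# Minimal sketch (not Code2Inv): checking a candidate loop invariant for
#   {x = 0, y = 0}  while (x < 5) { x += 1; y += 1 }  {y == 5}
# by bounded exhaustive checking in place of an automated theorem prover.

def inv(x, y):          # candidate invariant: x == y and x <= 5
    return x == y and x <= 5

def guard(x, y):        # loop guard
    return x < 5

def body(x, y):         # one loop iteration
    return x + 1, y + 1

def post(x, y):         # postcondition to prove
    return y == 5

def check(bound=20):
    # 1) initiation: the invariant holds in the initial state
    if not inv(0, 0):
        return False
    for x in range(-bound, bound):
        for y in range(-bound, bound):
            # 2) consecution: the invariant is preserved by the loop body
            if inv(x, y) and guard(x, y) and not inv(*body(x, y)):
                return False
            # 3) safety: on exit, the invariant implies the postcondition
            if inv(x, y) and not guard(x, y) and not post(x, y):
                return False
    return True

print(check())  # True: the candidate is an inductive invariant
```

The sparse-reward structure mentioned in the abstract corresponds to exactly this check: a candidate earns reward only when all three conditions pass at once.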
AutoICE: Automatically Synthesizing Verifiable C Code via LLM-driven Evolution
Luo, Weilin, Liang, Xueyi, Deng, Haotian, Liu, Yanan, Wan, Hai
Automatically synthesizing verifiable code from natural language requirements ensures software correctness and reliability while significantly lowering the barrier to adopting formal methods. With the rise of large language models (LLMs), long-standing efforts at autoformalization have gained new momentum. However, existing approaches suffer from severe syntactic and semantic errors due to the scarcity of domain-specific pre-training corpora, and they often fail to formalize implicit knowledge effectively. In this paper, we propose AutoICE, an LLM-driven evolutionary search for synthesizing verifiable C code. It introduces diverse individual initialization and collaborative crossover to enable diverse iterative updates, thereby mitigating the error propagation inherent in single-agent iteration. In addition, it employs self-reflective mutation to facilitate the discovery of implicit knowledge. Evaluation results demonstrate the effectiveness of AutoICE: it successfully verifies 90.36% of code, outperforming the state-of-the-art (SOTA) approach. Moreover, on a developer-friendly dataset variant, AutoICE achieves an 88.33% verification success rate, significantly surpassing the 65% success rate of the SOTA approach.
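The evolutionary loop the abstract describes can be sketched in outline. This is structure only: the initialization, crossover, mutation, and verifier functions below are hypothetical stubs (a real system would call an LLM and a C verifier), not AutoICE's actual API.

```python
# Structural sketch of an LLM-driven evolutionary search for verifiable code.
# All four operators are hypothetical stubs standing in for LLM/verifier calls.
import random

def init_population(n):
    # "diverse individual initialization" (stub: distinct seed candidates)
    return [f"candidate_{i}" for i in range(n)]

def crossover(a, b):
    # "collaborative crossover" (stub: combine two candidates)
    return a + "|" + b

def mutate(c):
    # "self-reflective mutation" (stub: revise a candidate)
    return c + "*"

def verified(c):
    # stand-in for running a C verifier on the candidate; here a candidate
    # "passes" once it has been both crossed over and mutated
    return "|" in c and "*" in c

def evolve(pop_size=4, generations=10, seed=0):
    random.seed(seed)
    pop = init_population(pop_size)
    for _ in range(generations):
        for c in pop:
            if verified(c):          # stop at the first verified candidate
                return c
        parents = random.sample(pop, 2)
        child = mutate(crossover(*parents))
        pop = pop[1:] + [child]      # replace the oldest individual
    return None
```

The point of the population (versus a single agent iterating on its own output) is that an error introduced by one candidate does not contaminate the whole search: crossover can recombine the unaffected parts of other individuals.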
Towards General Loop Invariant Generation: A Benchmark of Programs with Memory Manipulation
Anonymous Author(s)
1 Overview of Supplementary Material
Dataset Documentation: We have documented our dataset for its intended researchers as required. The link to download the fine-tuned models is https://mega.nz/file/M9FEWCjD# To fill the lack of benchmarks for general loop invariant generation, we propose LIG-MM, a loop invariant generation benchmark of memory-manipulation programs. Table 1 below shows basic statistics of the code in LIG-MM, and multiple examples are shown in Sec. 3.
Table 1: Statistics of our proposed LIG-MM benchmark.
LLM For Loop Invariant Generation and Fixing: How Far Are We?
Akhond, Mostafijur Rahman, Chakraborty, Saikat, Uddin, Gias
A loop invariant is a property of a loop that remains true before and after each execution of the loop. Identifying loop invariants is a critical step in automated program safety assessment. Recent advancements in Large Language Models (LLMs) have demonstrated potential in diverse software engineering (SE) and formal verification tasks; however, their performance at inferring loop invariants has not been studied. We report an empirical study of both open-source and closed-source LLMs of varying sizes, assessing their proficiency in inferring inductive loop invariants for programs and in fixing incorrect invariants. Our findings reveal that while LLMs exhibit some utility in inferring and repairing loop invariants, their performance is substantially enhanced when supplemented with auxiliary information such as domain knowledge and illustrative examples. LLMs achieve a maximum success rate of 78% in generating invariants, but only 16% in repairing them.
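The distinction between a property that merely holds and one that is *inductive* is what makes repair hard, and the sketch below illustrates it. The program, both candidates, and the bounded checker are a hypothetical example constructed for this note, not taken from the paper.

```python
# Hypothetical illustration: for  {i = 0}  while (i < 10) { i += 2 }  {i == 10},
# a candidate can hold on every reachable state yet fail to be inductive.

def inductive(inv, lo=-20, hi=20):
    """Bounded check of initiation, consecution, and safety."""
    if not inv(0):                                    # initiation
        return False
    for i in range(lo, hi):
        if inv(i) and i < 10 and not inv(i + 2):      # consecution
            return False
        if inv(i) and not i < 10 and i != 10:         # safety: inv and not guard => post
            return False
    return True

weak = lambda i: i != 11                        # true on all reachable states {0, 2, ..., 10}
fixed = lambda i: i % 2 == 0 and 0 <= i <= 10   # repaired, inductive candidate

print(inductive(weak))    # False: i = 9 satisfies it, but i + 2 = 11 does not
print(inductive(fixed))   # True
```

Repairing `weak` into `fixed` requires strengthening it with facts (evenness, the upper bound) that no single failing counterexample spells out, which is consistent with the large generate-versus-repair gap the study reports.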